@cmaloney cmaloney commented Oct 14, 2025

Update bytearray to store its data in a bytes object and provide a zero-copy path to "extract" that bytes object. This allows several code paths to be made more efficient.

This does not move any codepaths to make use of this new API. The documentation changes include common code patterns which can be made more efficient with this API.
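As a rough illustration of the kind of pattern meant here: code often accumulates data in a bytearray and then calls bytes(buf), which copies the whole buffer, before clearing it. The sketch below wraps that pattern in a hypothetical helper, drain(). The method name take_bytes is an assumption (this description does not name the new API); the helper falls back to the copying path when the method is absent, so it runs on interpreters without this change.

```python
def drain(buf: bytearray) -> bytes:
    """Return the contents of buf as bytes and leave buf empty."""
    # ASSUMPTION: the zero-copy method is named take_bytes(); the PR
    # description above does not state the name, so look it up defensively.
    take = getattr(buf, "take_bytes", None)
    if take is not None:
        # Zero-copy path: hand over the underlying bytes object.
        return take()
    # Fallback (the common pattern today): bytes(buf) copies the data,
    # then the bytearray is emptied separately.
    data = bytes(buf)
    buf.clear()
    return data


# Accumulate chunks in a mutable buffer, then hand off an immutable result.
buf = bytearray()
for chunk in (b"GET / ", b"HTTP/1.1", b"\r\n"):
    buf += chunk

result = drain(buf)
```

Either path gives the caller an immutable bytes and an empty buffer that can be reused; only the copy is avoided when the zero-copy method is available.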


With only the change that makes bytearray contain a bytes, I ran pyperformance on a --with-lto --enable-optimizations --with-static-libpython build (results below) and don't see any major speedups or slowdowns; everything seems to be within the noise of my machine (generally changes under 5%, or changes in benchmarks that don't touch bytes/bytearray).

pyperformance compare main.json bytearray_bytes.json

main.json

Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-14 00:55:52.519236
End date: 2025-10-14 02:23:01.308400

bytearray_bytes.json

Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-13 23:22:29.928152
End date: 2025-10-14 00:49:34.467284

+----------------------------------+-----------+----------------------+--------------+------------------------+
| Benchmark                        | main.json | bytearray_bytes.json | Change       | Significance           |
+==================================+===========+======================+==============+========================+
| 2to3                             | 137 ms    | 136 ms               | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_generators                 | 193 ms    | 195 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_cpu_io_mixed          | 285 ms    | 286 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_cpu_io_mixed_tg       | 289 ms    | 290 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager                 | 50.4 ms   | 51.5 ms              | 1.02x slower | Significant (t=-10.40) |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_cpu_io_mixed    | 223 ms    | 225 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_cpu_io_mixed_tg | 263 ms    | 264 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_io              | 370 ms    | 372 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_io_tg           | 380 ms    | 384 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_memoization     | 125 ms    | 126 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_memoization_tg  | 161 ms    | 162 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_tg              | 125 ms    | 125 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_io                    | 366 ms    | 360 ms               | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_io_tg                 | 359 ms    | 361 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_memoization           | 177 ms    | 181 ms               | 1.02x slower | Significant (t=-9.20)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_memoization_tg        | 188 ms    | 189 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_none                  | 151 ms    | 151 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_none_tg               | 150 ms    | 151 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| asyncio_tcp                      | 182 ms    | 161 ms               | 1.13x faster | Significant (t=32.85)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| asyncio_tcp_ssl                  | 548 ms    | 553 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| asyncio_websockets               | 342 ms    | 339 ms               | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| bench_mp_pool                    | 7.12 ms   | 7.08 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| bench_thread_pool                | 818 us    | 819 us               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| bpe_tokeniser                    | 2.10 sec  | 2.09 sec             | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| chaos                            | 27.9 ms   | 28.0 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| comprehensions                   | 7.45 us   | 7.24 us              | 1.03x faster | Significant (t=3.27)   |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| connected_components             | 308 ms    | 309 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| coroutines                       | 11.1 ms   | 11.2 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| coverage                         | 33.6 ms   | 34.1 ms              | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| create_gc_cycles                 | 1.16 ms   | 1.16 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| crypto_pyaes                     | 37.1 ms   | 35.6 ms              | 1.04x faster | Significant (t=10.63)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| dask                             | 347 ms    | 351 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| deepcopy                         | 118 us    | 117 us               | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| deepcopy_memo                    | 12.8 us   | 12.7 us              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| deepcopy_reduce                  | 1.32 us   | 1.34 us              | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| deltablue                        | 1.65 ms   | 1.64 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| django_template                  | 17.9 ms   | 17.8 ms              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| docutils                         | 1.19 sec  | 1.20 sec             | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| dulwich_log                      | 19.5 ms   | 19.7 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| fannkuch                         | 184 ms    | 181 ms               | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| float                            | 37.1 ms   | 36.7 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| gc_traversal                     | 3.04 ms   | 2.84 ms              | 1.07x faster | Significant (t=19.48)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| generators                       | 15.9 ms   | 15.3 ms              | 1.04x faster | Significant (t=7.03)   |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| genshi_text                      | 11.3 ms   | 11.2 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| genshi_xml                       | 25.5 ms   | 25.5 ms              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| go                               | 57.6 ms   | 56.7 ms              | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| hexiom                           | 2.92 ms   | 2.88 ms              | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| html5lib                         | 26.0 ms   | 26.5 ms              | 1.02x slower | Significant (t=-9.20)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| json_dumps                       | 4.48 ms   | 4.44 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| json_loads                       | 11.7 us   | 11.7 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| k_core                           | 1.41 sec  | 1.42 sec             | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| logging_format                   | 3.27 us   | 3.30 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| logging_silent                   | 45.5 ns   | 45.8 ns              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| logging_simple                   | 3.02 us   | 3.01 us              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| mako                             | 6.02 ms   | 6.03 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| many_optionals                   | 473 us    | 478 us               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| mdp                              | 587 ms    | 578 ms               | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| meteor_contest                   | 50.2 ms   | 50.5 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| nbody                            | 54.6 ms   | 52.4 ms              | 1.04x faster | Significant (t=10.72)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| nqueens                          | 41.7 ms   | 40.4 ms              | 1.03x faster | Significant (t=6.79)   |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pathlib                          | 9.77 ms   | 9.73 ms              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pickle                           | 5.99 us   | 6.01 us              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pickle_dict                      | 12.5 us   | 12.8 us              | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pickle_list                      | 1.98 us   | 1.96 us              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pickle_pure_python               | 149 us    | 150 us               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pidigits                         | 111 ms    | 115 ms               | 1.03x slower | Significant (t=-18.53) |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pprint_pformat                   | 737 ms    | 748 ms               | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pprint_safe_repr                 | 362 ms    | 369 ms               | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pyflate                          | 211 ms    | 205 ms               | 1.03x faster | Significant (t=7.43)   |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| python_startup                   | 7.88 ms   | 7.88 ms              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| python_startup_no_site           | 4.72 ms   | 4.76 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| raytrace                         | 130 ms    | 128 ms               | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| regex_compile                    | 50.0 ms   | 50.2 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| regex_dna                        | 101 ms    | 103 ms               | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| regex_effbot                     | 1.72 ms   | 1.77 ms              | 1.03x slower | Significant (t=-26.42) |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| regex_v8                         | 12.5 ms   | 12.3 ms              | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| richards                         | 20.4 ms   | 20.0 ms              | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| richards_super                   | 23.4 ms   | 22.8 ms              | 1.03x faster | Significant (t=11.36)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| scimark_fft                      | 154 ms    | 153 ms               | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| scimark_lu                       | 55.4 ms   | 57.0 ms              | 1.03x slower | Significant (t=-5.67)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| scimark_monte_carlo              | 32.8 ms   | 32.8 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| scimark_sor                      | 57.8 ms   | 56.9 ms              | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| scimark_sparse_mat_mult          | 2.75 ms   | 2.76 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| shortest_path                    | 316 ms    | 318 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| spectral_norm                    | 47.7 ms   | 51.6 ms              | 1.08x slower | Significant (t=-2.01)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sphinx                           | 465 ms    | 467 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sqlglot_v2_normalize             | 50.3 ms   | 50.2 ms              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sqlglot_v2_optimize              | 24.2 ms   | 24.4 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sqlglot_v2_parse                 | 576 us    | 572 us               | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sqlglot_v2_transpile             | 724 us    | 722 us               | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sqlite_synth                     | 1.14 us   | 1.15 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| subparsers                       | 20.6 ms   | 20.7 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sympy_expand                     | 181 ms    | 184 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sympy_integrate                  | 8.54 ms   | 8.55 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sympy_str                        | 103 ms    | 105 ms               | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sympy_sum                        | 55.9 ms   | 56.0 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| telco                            | 3.39 ms   | 3.34 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| tomli_loads                      | 971 ms    | 982 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| typing_runtime_protocols         | 73.2 us   | 73.6 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| unpack_sequence                  | 25.2 ns   | 23.0 ns              | 1.10x faster | Significant (t=7.03)   |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| unpickle                         | 6.99 us   | 7.05 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| unpickle_list                    | 2.07 us   | 2.10 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| unpickle_pure_python             | 105 us    | 104 us               | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| xml_etree_generate               | 40.5 ms   | 40.7 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| xml_etree_iterparse              | 49.7 ms   | 50.4 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| xml_etree_parse                  | 77.2 ms   | 79.1 ms              | 1.02x slower | Significant (t=-16.14) |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| xml_etree_process                | 29.5 ms   | 29.8 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
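To check the "generally under 5%" reading against a table like the one above, the rows can be filtered mechanically. This is a minimal sketch, not part of pyperformance: the helper name large_changes, the regular expression, and the 1.05 threshold are assumptions chosen for illustration, and the sample input is a handful of rows copied from the table.

```python
import re

# A few rows copied from the comparison table above.
TABLE = """\
| 2to3                             | 137 ms    | 136 ms               | 1.00x faster | Not significant        |
| asyncio_tcp                      | 182 ms    | 161 ms               | 1.13x faster | Significant (t=32.85)  |
| gc_traversal                     | 3.04 ms   | 2.84 ms              | 1.07x faster | Significant (t=19.48)  |
| pidigits                         | 111 ms    | 115 ms               | 1.03x slower | Significant (t=-18.53) |
| spectral_norm                    | 47.7 ms   | 51.6 ms              | 1.08x slower | Significant (t=-2.01)  |
| unpack_sequence                  | 25.2 ns   | 23.0 ns              | 1.10x faster | Significant (t=7.03)   |
"""

# name | base | changed | "<ratio>x faster|slower" | significance
ROW = re.compile(
    r"^\|\s*(\S+)\s*\|[^|]*\|[^|]*\|\s*([\d.]+)x (faster|slower)\s*\|\s*(.*?)\s*\|$"
)


def large_changes(table: str, threshold: float = 1.05):
    """Return (name, ratio, direction) for significant rows at or above threshold."""
    hits = []
    for line in table.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # border lines and the header row do not match
        name, ratio, direction, significance = match.groups()
        if float(ratio) >= threshold and significance.startswith("Significant"):
            hits.append((name, float(ratio), direction))
    return hits


for name, ratio, direction in large_changes(TABLE):
    print(f"{name}: {ratio:.2f}x {direction}")
```

On these sample rows the filter keeps asyncio_tcp, gc_traversal, spectral_norm, and unpack_sequence, and drops pidigits, which is flagged significant but changes by only 3%.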

This sets things up so the bytes can be "taken" as a bytes object without
requiring a copy.

I ran pyperformance (results below) and don't see any major speedups
or slowdowns with this; all seems to be in the noise of my machine.

------

pyperformance compare main.json bytearray_bytes.json -O table
main.json
=========

Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-14 00:55:52.519236
End date: 2025-10-14 02:23:01.308400

bytearray_bytes.json
====================

Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-13 23:22:29.928152
End date: 2025-10-14 00:49:34.467284

+----------------------------------+-----------+----------------------+--------------+------------------------+
| Benchmark                        | main.json | bytearray_bytes.json | Change       | Significance           |
+==================================+===========+======================+==============+========================+
| 2to3                             | 137 ms    | 136 ms               | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_generators                 | 193 ms    | 195 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_cpu_io_mixed          | 285 ms    | 286 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_cpu_io_mixed_tg       | 289 ms    | 290 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager                 | 50.4 ms   | 51.5 ms              | 1.02x slower | Significant (t=-10.40) |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_cpu_io_mixed    | 223 ms    | 225 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_cpu_io_mixed_tg | 263 ms    | 264 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_io              | 370 ms    | 372 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_io_tg           | 380 ms    | 384 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_memoization     | 125 ms    | 126 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_memoization_tg  | 161 ms    | 162 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_eager_tg              | 125 ms    | 125 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_io                    | 366 ms    | 360 ms               | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_io_tg                 | 359 ms    | 361 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_memoization           | 177 ms    | 181 ms               | 1.02x slower | Significant (t=-9.20)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_memoization_tg        | 188 ms    | 189 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_none                  | 151 ms    | 151 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| async_tree_none_tg               | 150 ms    | 151 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| asyncio_tcp                      | 182 ms    | 161 ms               | 1.13x faster | Significant (t=32.85)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| asyncio_tcp_ssl                  | 548 ms    | 553 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| asyncio_websockets               | 342 ms    | 339 ms               | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| bench_mp_pool                    | 7.12 ms   | 7.08 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| bench_thread_pool                | 818 us    | 819 us               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| bpe_tokeniser                    | 2.10 sec  | 2.09 sec             | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| chaos                            | 27.9 ms   | 28.0 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| comprehensions                   | 7.45 us   | 7.24 us              | 1.03x faster | Significant (t=3.27)   |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| connected_components             | 308 ms    | 309 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| coroutines                       | 11.1 ms   | 11.2 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| coverage                         | 33.6 ms   | 34.1 ms              | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| create_gc_cycles                 | 1.16 ms   | 1.16 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| crypto_pyaes                     | 37.1 ms   | 35.6 ms              | 1.04x faster | Significant (t=10.63)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| dask                             | 347 ms    | 351 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| deepcopy                         | 118 us    | 117 us               | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| deepcopy_memo                    | 12.8 us   | 12.7 us              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| deepcopy_reduce                  | 1.32 us   | 1.34 us              | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| deltablue                        | 1.65 ms   | 1.64 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| django_template                  | 17.9 ms   | 17.8 ms              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| docutils                         | 1.19 sec  | 1.20 sec             | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| dulwich_log                      | 19.5 ms   | 19.7 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| fannkuch                         | 184 ms    | 181 ms               | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| float                            | 37.1 ms   | 36.7 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| gc_traversal                     | 3.04 ms   | 2.84 ms              | 1.07x faster | Significant (t=19.48)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| generators                       | 15.9 ms   | 15.3 ms              | 1.04x faster | Significant (t=7.03)   |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| genshi_text                      | 11.3 ms   | 11.2 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| genshi_xml                       | 25.5 ms   | 25.5 ms              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| go                               | 57.6 ms   | 56.7 ms              | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| hexiom                           | 2.92 ms   | 2.88 ms              | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| html5lib                         | 26.0 ms   | 26.5 ms              | 1.02x slower | Significant (t=-9.20)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| json_dumps                       | 4.48 ms   | 4.44 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| json_loads                       | 11.7 us   | 11.7 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| k_core                           | 1.41 sec  | 1.42 sec             | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| logging_format                   | 3.27 us   | 3.30 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| logging_silent                   | 45.5 ns   | 45.8 ns              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| logging_simple                   | 3.02 us   | 3.01 us              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| mako                             | 6.02 ms   | 6.03 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| many_optionals                   | 473 us    | 478 us               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| mdp                              | 587 ms    | 578 ms               | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| meteor_contest                   | 50.2 ms   | 50.5 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| nbody                            | 54.6 ms   | 52.4 ms              | 1.04x faster | Significant (t=10.72)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| nqueens                          | 41.7 ms   | 40.4 ms              | 1.03x faster | Significant (t=6.79)   |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pathlib                          | 9.77 ms   | 9.73 ms              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pickle                           | 5.99 us   | 6.01 us              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pickle_dict                      | 12.5 us   | 12.8 us              | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pickle_list                      | 1.98 us   | 1.96 us              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pickle_pure_python               | 149 us    | 150 us               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pidigits                         | 111 ms    | 115 ms               | 1.03x slower | Significant (t=-18.53) |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pprint_pformat                   | 737 ms    | 748 ms               | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pprint_safe_repr                 | 362 ms    | 369 ms               | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| pyflate                          | 211 ms    | 205 ms               | 1.03x faster | Significant (t=7.43)   |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| python_startup                   | 7.88 ms   | 7.88 ms              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| python_startup_no_site           | 4.72 ms   | 4.76 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| raytrace                         | 130 ms    | 128 ms               | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| regex_compile                    | 50.0 ms   | 50.2 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| regex_dna                        | 101 ms    | 103 ms               | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| regex_effbot                     | 1.72 ms   | 1.77 ms              | 1.03x slower | Significant (t=-26.42) |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| regex_v8                         | 12.5 ms   | 12.3 ms              | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| richards                         | 20.4 ms   | 20.0 ms              | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| richards_super                   | 23.4 ms   | 22.8 ms              | 1.03x faster | Significant (t=11.36)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| scimark_fft                      | 154 ms    | 153 ms               | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| scimark_lu                       | 55.4 ms   | 57.0 ms              | 1.03x slower | Significant (t=-5.67)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| scimark_monte_carlo              | 32.8 ms   | 32.8 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| scimark_sor                      | 57.8 ms   | 56.9 ms              | 1.02x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| scimark_sparse_mat_mult          | 2.75 ms   | 2.76 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| shortest_path                    | 316 ms    | 318 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| spectral_norm                    | 47.7 ms   | 51.6 ms              | 1.08x slower | Significant (t=-2.01)  |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sphinx                           | 465 ms    | 467 ms               | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sqlglot_v2_normalize             | 50.3 ms   | 50.2 ms              | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sqlglot_v2_optimize              | 24.2 ms   | 24.4 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sqlglot_v2_parse                 | 576 us    | 572 us               | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sqlglot_v2_transpile             | 724 us    | 722 us               | 1.00x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sqlite_synth                     | 1.14 us   | 1.15 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| subparsers                       | 20.6 ms   | 20.7 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sympy_expand                     | 181 ms    | 184 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sympy_integrate                  | 8.54 ms   | 8.55 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sympy_str                        | 103 ms    | 105 ms               | 1.02x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| sympy_sum                        | 55.9 ms   | 56.0 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| telco                            | 3.39 ms   | 3.34 ms              | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| tomli_loads                      | 971 ms    | 982 ms               | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| typing_runtime_protocols         | 73.2 us   | 73.6 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| unpack_sequence                  | 25.2 ns   | 23.0 ns              | 1.10x faster | Significant (t=7.03)   |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| unpickle                         | 6.99 us   | 7.05 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| unpickle_list                    | 2.07 us   | 2.10 us              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| unpickle_pure_python             | 105 us    | 104 us               | 1.01x faster | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| xml_etree_generate               | 40.5 ms   | 40.7 ms              | 1.00x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| xml_etree_iterparse              | 49.7 ms   | 50.4 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| xml_etree_parse                  | 77.2 ms   | 79.1 ms              | 1.02x slower | Significant (t=-16.14) |
+----------------------------------+-----------+----------------------+--------------+------------------------+
| xml_etree_process                | 29.5 ms   | 29.8 ms              | 1.01x slower | Not significant        |
+----------------------------------+-----------+----------------------+--------------+------------------------+
@cmaloney cmaloney changed the title gh-139871: Update bytearray to contain PyBytesObject gh-139871: Implement bytearray.take_bytes([n]) to efficiently extract bytes Oct 15, 2025
@cmaloney cmaloney changed the title gh-139871: Implement bytearray.take_bytes([n]) to efficiently extract bytes gh-139871: Add bytearray.take_bytes([n]) to efficiently extract bytes Oct 15, 2025
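
For readers skimming the thread, here is a minimal sketch of the pattern the retitled method targets. It assumes `take_bytes(n)` removes the first `n` bytes from the bytearray and returns them as `bytes` (everything when `n` is omitted); that reading is inferred from the title and is not spelled out in this conversation. `take_prefix` is a hypothetical helper, and its fallback branch is the copying idiom available on interpreters without the method.

```python
from typing import Optional


def take_prefix(buf: bytearray, n: Optional[int] = None) -> bytes:
    """Remove and return the first n bytes of buf (all of it if n is None).

    Hypothetical helper: prefers bytearray.take_bytes when the running
    interpreter provides it, otherwise falls back to copy-then-delete.
    """
    if hasattr(buf, "take_bytes"):
        # Assumed semantics: hand the leading bytes over to a bytes object.
        return buf.take_bytes() if n is None else buf.take_bytes(n)
    if n is None:
        n = len(buf)
    if not 0 <= n <= len(buf):
        raise IndexError("n is out of range for this bytearray")
    data = bytes(buf[:n])  # copies the prefix into an immutable bytes
    del buf[:n]            # drops the prefix from the bytearray
    return data


buf = bytearray(b"header:payload")
head = take_prefix(buf, 7)
rest = take_prefix(buf)
print(head, rest, buf)
```

The fallback copies the data, which is the cost the PR description says the new method is meant to avoid.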
@cmaloney
Contributor Author

The threading tests exposed a non-threading issue: after this change, ba = bytearray(b'123'); ba.clear(); ba.copy() ends up with slightly different internals (sizeof, alloc) than before. Exploring options.
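
The sequence in the comment above can be inspected from Python. This is a minimal sketch, assuming CPython: `__alloc__()` is a CPython-specific bytearray method and `sys.getsizeof` reflects the implementation's own accounting, so the printed numbers are expected to differ between builds and are deliberately not asserted. `describe` is a hypothetical helper.

```python
import sys


def describe(ba):
    """Return (length, allocated bytes, reported size) for a bytearray."""
    # __alloc__() is a CPython implementation detail, not a language guarantee.
    return len(ba), ba.__alloc__(), sys.getsizeof(ba)


ba = bytearray(b"123")
ba.clear()
copy = ba.copy()

for name, obj in (("cleared", ba), ("copy of cleared", copy)):
    length, alloc, size = describe(obj)
    print(f"{name}: len={length} alloc={alloc} sizeof={size}")
```

Running it on a build with and without the change is one way to compare the two sets of internals.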

Co-authored-by: Maurycy Pawłowski-Wieroński <[email protected]>
-    return PyLong_FromSsize_t(FT_ATOMIC_LOAD_SSIZE_RELAXED(self->ob_alloc));
+    Py_ssize_t alloc = FT_ATOMIC_LOAD_SSIZE_RELAXED(self->ob_alloc);
+    if (alloc > 0) {
+        alloc += sizeof(PyBytesObject);
Contributor Author

I add the size of PyBytesObject here (and in sizeof) because existing code expects ob_alloc to be the number of bytes of space available (as opposed to ob_size, the number of bytes in use). It felt more straightforward to leave the definitions of ob_alloc and ob_size as they were and adjust only the size reporting here.
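
The difference between bytes in use and bytes available is visible from Python as well. This is a small sketch, again assuming CPython's `__alloc__()`. The growth steps depend on the build's resize strategy, so it relies only on the invariant that reserved space is never below the length.

```python
buf = bytearray()
samples = []
for i in range(64):
    buf.append(i)
    # len() is the bytes in use; __alloc__() is the space currently reserved.
    samples.append((len(buf), buf.__alloc__()))

# Over-allocation shows up as runs where the reserved space stays flat
# while the length keeps growing.
for length, reserved in samples[:8]:
    print(f"in use={length:2d} reserved={reserved:2d}")
```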

@cmaloney
Contributor Author

@vstinner I think this is ready for another pass. I left GitHub comments in places where I am unsure of the standard CPython way to do something, as well as in places where I'm not sure what the right decision is.

cmaloney and others added 3 commits October 26, 2025 23:21
Co-authored-by: Victor Stinner <[email protected]>
1. After __init__ or C construction guarantee ob_bytes_object is set
   by using empty bytes object.
2. In resize place a null terminator mid-buffer only if required
3. Remove now-unneeded branches
   - n == PY_SSIZE_T_MAX checks are redundant with resize checks.
   - size = 0 is handled by PyBytes_FromStringAndSize
   - No more alloc + 1; exact resize is exact and bytes does +1 for null
   - No downsize to 0 special case since alloc == size there.
-    if (size == 0) {
-        new->ob_bytes = NULL;
-        alloc = 0;
+    new->ob_bytes_object = PyBytes_FromStringAndSize(NULL, size);
Contributor Author

Note: this is now always set, but when size=0 PyBytes_FromStringAndSize doesn't actually allocate; it returns the empty bytes object. The optimization of not allocating for a zero-sized bytearray is therefore kept.

PyBytes_FromStringAndSize(const char *str, Py_ssize_t size)
{
    PyBytesObject *op;
    if (size < 0) {
        PyErr_SetString(PyExc_SystemError,
            "Negative size passed to PyBytes_FromStringAndSize");
        return NULL;
    }
    if (size == 1 && str != NULL) {
        op = CHARACTER(*str & 255);
        assert(_Py_IsImmortal(op));
        return (PyObject *)op;
    }
    if (size == 0) {
        return bytes_get_empty();
    }
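
The zero-size fast path quoted above has an observable consequence in CPython: empty bytes objects built through different routes are typically the same object. That is an implementation detail rather than a language guarantee, so this sketch prints the identity check and relies only on equality.

```python
empty_literal = b""
empty_call = bytes()
empty_from_bytearray = bytes(bytearray())

# Equality always holds; identity is a CPython implementation detail that
# follows from returning one shared empty bytes object.
print(empty_call == empty_literal, empty_call is empty_literal)
print(empty_from_bytearray == empty_literal, empty_from_bytearray is empty_literal)
```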

Member

Please add a comment to mention that the object is the empty bytes string singleton if size=0.


-    /* Prevent buffer overflow when setting alloc to size+1. */
+    /* Prevent buffer overflow when setting alloc to size. */
     if (size == PY_SSIZE_T_MAX) {
Member

You can remove this test: PyBytes_FromStringAndSize() has a stricter check on the maximum size:

    if ((size_t)size > (size_t)PY_SSIZE_T_MAX - PyBytesObject_SIZE) {
        PyErr_SetString(PyExc_OverflowError,
                        "byte string is too large");
        return NULL;
    }

Contributor Author

Is it okay if this starts raising OverflowError instead of MemoryError? MemoryError is tested for in test_bytes, but it would simplify a number of cases if we can always rely on bytes to do the length check.

Because the PyObject_HEAD is inline, bytes has a slightly lower maximum length than bytearray, which did a PyMem_Malloc. That means the maximum bytearray size moves from PY_SSIZE_T_MAX - 1 (where overflowing a Py_ssize_t is a more frequent concern) to PY_SSIZE_T_MAX - PyBytesObject_SIZE. That is also why the + 1 checks are no longer needed: the bytes allocation will always fail with an overflow error before a + 1 could wrap around.
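
The size arithmetic in this comment can be checked with ordinary integers. This is a sketch under stated assumptions: `sys.maxsize` stands in for PY_SSIZE_T_MAX, and `sys.getsizeof(b"")` is used as an approximation of PyBytesObject_SIZE (the bytes header plus the trailing null byte) on CPython. `old_max`, `new_max` and `bytes_overhead` are illustrative names, not identifiers from the patch.

```python
import sys

PY_SSIZE_T_MAX = sys.maxsize
# Assumption: the reported size of an empty bytes object approximates the
# fixed per-object overhead that the C macro PyBytesObject_SIZE describes.
bytes_overhead = sys.getsizeof(b"")

old_max = PY_SSIZE_T_MAX - 1               # limit described for the old bytearray
new_max = PY_SSIZE_T_MAX - bytes_overhead  # limit once storage is a bytes object

# With the lower limit, size + 1 stays below PY_SSIZE_T_MAX, which is why
# separate "+ 1" overflow guards become redundant.
print(new_max < old_max, new_max + 1 < PY_SSIZE_T_MAX)
```

In C the same property would be a compile-time fact, since a Py_ssize_t cannot hold the wrapped value. The Python integers here only model the bounds.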

Contributor Author

Added PyByteArray_SIZE_MAX to simplify these checks.

     }
-    if (alloc > PY_SSIZE_T_MAX) {
+    // NOTE: offsetof() logic copied from PyBytesObject_SIZE in bytesobject.c
+    if (alloc > PY_SSIZE_T_MAX - (offsetof(PyBytesObject, ob_sval) + 1)) {
Member

Can you please move PyBytesObject_SIZE from bytesobject.c to pycore_bytesobject.h and rename it as _PyBytesObject_SIZE? Then add #define PyBytesObject_SIZE _PyBytesObject_SIZE to bytesobject.c (to avoid modifying the code).

Contributor Author

Moved. I also added PyByteArray_SIZE_MAX, which generally replaces PY_SSIZE_T_MAX to make the checks more precise (and found one case that was still doing a + 1 / - 1 that is no longer needed).

-        PyErr_SetString(PyExc_OverflowError,
-                        "cannot add more objects to bytearray");
-        return NULL;
-    }
Member

You cannot remove this check: there is an n+1 just below which can overflow, no?

Contributor Author

@cmaloney cmaloney Oct 29, 2025

With the new constant, this passes when I add it locally:
static_assert(PyByteArray_SIZE_MAX + 1 < PY_SSIZE_T_MAX, "Py_SIZE(self) + 1 code may overflow");

(There's a static_assert in the new JIT, but I don't see one anywhere else, and I don't know a good place to add it if we want it in the codebase.)

@cmaloney
Contributor Author

cmaloney commented Oct 29, 2025

With a little more tweaking, bytearray can rely on bytes for _PyByteArray_empty_string (changes relative to this PR: https://github.com/cmaloney/cpython/pull/1/files). That makes __init__ slightly more complicated but simplifies bytearray_resize_lock_held; I'm not sure it is worth incorporating here.
